Ranking Words for Building a Japanese Defining Vocabulary

Authors

  • Tomoya Noro
  • Takehiro Tokuda
Abstract

Defining all words in a Japanese dictionary with a limited set of words (a defining vocabulary) is helpful for Japanese children and second-language learners of Japanese. Although some English dictionaries have their own defining vocabulary, no Japanese dictionary has one yet. As a first step toward building a Japanese defining vocabulary, we ranked Japanese words using a graph-based method. In this paper, we introduce the method and show some evaluation results from applying it to an existing Japanese dictionary.
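
The abstract does not specify which graph-based method is used. As one possible illustration only, the sketch below assumes a PageRank-style score over a definition graph, where each headword points to the words that appear in its definition, so words used in many definitions accumulate a higher score. The function name `rank_words`, the damping factor, and the toy romanized dictionary are all hypothetical and are not taken from the paper.

```python
def rank_words(definitions, damping=0.85, iterations=50):
    """Rank words with a PageRank-style score over a definition graph.

    definitions: dict mapping a headword to the list of words used in its
    definition. An edge runs from the headword to each defining word.
    NOTE: an illustrative assumption, not the method described in the paper.
    """
    # Collect every word that appears as a headword or inside a definition.
    words = set(definitions)
    for used in definitions.values():
        words.update(used)
    words = sorted(words)
    n = len(words)

    score = {w: 1.0 / n for w in words}
    for _ in range(iterations):
        new = {w: (1.0 - damping) / n for w in words}
        for w in words:
            # A word with no definition spreads its score uniformly.
            targets = definitions.get(w) or words
            share = damping * score[w] / len(targets)
            for t in targets:
                new[t] += share
        score = new

    # Highest score first; ties broken alphabetically.
    ranking = sorted(words, key=lambda w: (-score[w], w))
    return ranking, score


# Hypothetical toy dictionary (romanized): headword -> words in its definition.
toy = {
    "inu": ["doubutsu"],       # dog: animal
    "neko": ["doubutsu"],      # cat: animal
    "doubutsu": ["ikimono"],   # animal: living thing
    "ikimono": ["doubutsu"],   # living thing: animal
}

ranking, score = rank_words(toy)
print(ranking)
```

In this toy graph, words that many definitions depend on rise to the top of the ranking, which is the kind of ordering from which a defining vocabulary could be cut off at a chosen size.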

Similar resources

A New Indexing and Text Ranking Method for Japanese Text Databases Using Simple-Word Compounds as Keywords

This paper describes a new indexing method for Japanese text databases using the simple keyword string. A compound word is treated as a string of simple words, which are the smallest units in Japanese grammar that still maintain their meanings. As a result, retrieved texts can be ranked according to the similarity of their meaning and the query, without using a control vocabulary or thesaurus...

Handling of Out-of-vocabulary Words in Japanese-English Machine Translation by Exploiting Parallel Corpus

A large number of loanwords and orthographic variants in Japanese pose a challenge for machine translation. In this article, we present a hybrid model for handling out-of-vocabulary words in Japanese-to-English statistical machine translation output by exploiting parallel corpus. As the Japanese writing system makes use of four different script sets (kanji, hiragana, katakana, and romaji), we t...

Exploiting Parallel Corpus for Handling Out-of-Vocabulary Words

This paper presents a hybrid model for handling out-of-vocabulary words in Japanese-to-English statistical machine translation output by exploiting parallel corpus. As the Japanese writing system makes use of four different script sets (kanji, hiragana, katakana, and romaji), we treat these scripts differently. A machine transliteration model is built to transliterate out-of-vocabulary Japanese k...

Automatic Selection of Defining Vocabulary in an Explanatory Dictionary

One of the problems in converting a conventional (human-oriented) explanatory dictionary into a semantic database intended for use in automatic reasoning systems is that such a database should not contain any cycles in its definitions, while traditional dictionaries usually contain them. The cycles can be eliminated by declaring some words “primitive” (having no definition) while all ot...

Robust Building Identification for Mobile Augmented Reality

Mobile augmented reality applications have received considerable interest in recent years, as camera-equipped mobile phones become ubiquitous. We have developed a “Point and Find” application on a cell phone, where a user can point his cell phone at a building on the Stanford campus and get relevant information about the building on his phone. The problem of recognizing buildings under different ...

Publication date: 2008